Usage Profile Generation from Web Usage Data Using Hybrid Biclustering Algorithm

Authors

  • R. Rathipriya
  • K. Thangavel
  • J. Bagyamani
Abstract

Biclustering has the potential to make significant contributions in the fields of information retrieval, web mining, and so forth. In this paper, the authors analyze the complex association between users and pages of a web site by using a biclustering algorithm. This method automatically identifies the groups of users that show similar browsing patterns under a specific subset of the pages. A mutation operator from Genetic Algorithms is incorporated into Binary Particle Swarm Optimization (BPSO) for biclustering of web usage data. This hybridization can increase the diversity of the population and help the particles escape from local optima. The method detects optimized user profile groups according to coherent browsing behavior. Experiments are performed on a benchmark clickstream dataset to test the effectiveness of the proposed algorithm. The results show that the proposed algorithm performs better than existing PSO methods. The interpretation of these biclustering results is useful for marketing and sales strategies.

DOI: 10.4018/jaec.2011100103. International Journal of Applied Evolutionary Computation, 2(4), 37-49, October-December 2011. Copyright © 2011, IGI Global.

... relations among users regarding their browsing interest (Chakraborty & Maka, 2005). From the business and application point of view, knowledge obtained from Web usage patterns could be directly applied to efficiently manage activities related to e-Business, eCRM, e-Services, e-Education, e-Newspapers, e-Government, Digital Libraries, and so on (Abraham & Ramos, 2003). Jespersen, Thorhauge, and Bach Pedersen (2002) proposed a hybrid approach for analyzing visitor click stream sequences. A user profile (Chen & Shahabi, 2001) is a collection of personal information.
The information is stored without adding further description or interpretation. It represents cognitive skills, intellectual abilities, intention of browsing, browsing styles, preferences, and interactions with the pages of specific web sites. User profiling is the process of constructing a user profile by extracting information from a set of data, and it is a fundamental task in web personalization. In Martín-Bautista, Kraft, Vila, Chen, and Cruz (2004) and Martín-Bautista, Vila, and Escobar-Jeria (2008), two types of profiles are proposed: simple profiles, which are represented by data extracted from the users’ interests, and extended profiles, which contain additional information about the user such as age, language level, location, and others. Mobasher, Cooley, and Srivastava (1999) and Mobasher, Dai, Luo, Nakagawa, Sun, and Wiltshire (2000) proposed a web personalization system that consists of offline tasks related to the mining of usage data and an online process of automatic Web page customization based on the knowledge discovered. The LumberJack model proposed by Chi, Rosien, and Heer (2002) builds up user profiles by combining clustering of user sessions, using the k-means algorithm, with traditional statistical traffic analysis. Li (2009) attempted to provide an up-to-date survey of the rapidly growing area of Web session clustering, analyzed the shortcomings of traditional similarity measures between web sessions, and proposed a framework for Web session clustering using sequence alignment from computational biology. Lee and Fu (2008) used hierarchical agglomerative clustering to cluster users’ browsing behaviors. In that paper, an improved Two Levels of Prediction Model was presented to achieve a higher hit ratio without suffering from the heterogeneity of users’ behavior. Labroche (2007) proposed a comparison of relational clustering algorithms on web usage data to characterize user access profiles.
These methods rely only on numerical values that represent the distance or dissimilarity between web user sessions to construct web user profiles. In the literature, clustering and association rules (Agrawal, Imielinski, & Swami, 1993) are frequently applied to obtain user profiles from web usage data. User profiles derived from clustering results can be used to guide marketing strategies according to the groups (Krishnapuram, Joshi, & Nasraoui, 2001). Association rules discover associations and correlations among items, where the presence of an item or group of items in a transaction implies the presence of other items (Agrawal et al., 1993). Association rules are used to identify relations among the visits of users with a certain navigational pattern to the web site. Alam, Dobbie, and Riddle (2008) proposed a swarm-intelligence-based PSO clustering algorithm for clustering Web user sessions and claimed that it performs better than the benchmark k-means clustering algorithm on Web usage sessions. In Rambharose and Nikov (2010), various Computational Intelligence (CI) models such as Fuzzy Systems, Genetic Algorithms, Neural Networks, Artificial Immune Systems, Particle Swarm Optimization, Ant Colony Optimization, Bee Colony Optimization, and Wasp Colony Optimization are reviewed and compared regarding their inception, functions, performance, and application to the personalization of interactive web systems. PSO was credited with good performance compared to the other methods. In Premalatha and Natarajan (2010), modification strategies for PSO using GA are proposed. The experimental results on benchmark functions show that the proposed hybrid models outperform standard PSO.
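To make the hybrid idea from the abstract more concrete, here is a minimal sketch of binary PSO with a GA-style bit-flip mutation, applied to a toy user-by-page click matrix. It is not the authors' implementation: the function names, the parameter values, the toy `clicks` matrix, and in particular the fitness function (reward visited cells, penalize unvisited cells inside the selected submatrix) are all assumptions chosen for illustration, since the excerpt does not specify them.

```python
import math
import random


def make_fitness(matrix):
    """Hypothetical fitness: +1 per visited cell, -1 per unvisited cell in the bicluster."""
    n_users, n_pages = len(matrix), len(matrix[0])

    def fitness(bits):
        # The first n_users bits select users; the remaining bits select pages.
        users = [u for u in range(n_users) if bits[u]]
        pages = [p for p in range(n_pages) if bits[n_users + p]]
        return sum(1 if matrix[u][p] else -1 for u in users for p in pages)

    return fitness


def bpso_with_mutation(fitness, n_bits, n_particles=20, n_iters=60,
                       w=0.7, c1=1.5, c2=1.5, v_max=4.0, p_mut=0.02, seed=0):
    """Maximize `fitness` over bit lists with binary PSO plus bit-flip mutation."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_bits):
                # Velocity update, clamped to [-v_max, v_max].
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # The sigmoid of the velocity is the probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob_one else 0
                # GA-style mutation: flip the bit with a small probability.
                if rng.random() < p_mut:
                    pos[i][d] ^= 1
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Toy click matrix (rows: users, columns: pages); 1 means the user visited the page.
clicks = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 1],
]
fitness = make_fitness(clicks)
best, best_fit = bpso_with_mutation(fitness, n_bits=len(clicks) + len(clicks[0]))
selected_users = [u for u in range(len(clicks)) if best[u]]
selected_pages = [p for p in range(len(clicks[0])) if best[len(clicks) + p]]
print(selected_users, selected_pages, best_fit)
```

The mutation step is the only departure from plain binary PSO in this sketch: it occasionally flips a bit regardless of the velocity, which is one simple way to keep the swarm from collapsing onto a single candidate bicluster.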
Much research is being done in the area of Web Usage Mining (Martín-Bautista et al., 2004; Pallis, Angelis, & Vakali, 2005) and in the application of PSO to various fields. Depending on the goals of the analyst and the application, various algorithms can be applied for cluster analysis.
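The association rules mentioned in the related work, where the presence of some pages in a session implies the presence of others, are usually scored by support and confidence. The following is a small illustrative sketch of those two measures over web sessions; the session data and the function names are hypothetical and are not taken from the paper.

```python
def support(sessions, items):
    """Fraction of sessions that contain every page in `items`."""
    items = set(items)
    return sum(1 for s in sessions if items <= set(s)) / len(sessions)


def confidence(sessions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the sessions."""
    base = support(sessions, antecedent)
    if base == 0:
        return 0.0
    return support(sessions, set(antecedent) | set(consequent)) / base


# Hypothetical sessions: each one is the set of pages a user visited.
sessions = [
    {"home", "products", "cart"},
    {"home", "products"},
    {"home", "about"},
    {"products", "cart"},
]

rule_support = support(sessions, {"products", "cart"})
rule_confidence = confidence(sessions, {"products"}, {"cart"})
print(rule_support, rule_confidence)
```

Here the rule "products implies cart" holds in two of the four sessions, and in two of the three sessions that contain "products", so its support is 0.5 and its confidence is two thirds.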

Similar resources

Mining Correlated Bicluster from Web Usage Data Using Discrete Firefly Algorithm Based Biclustering Approach

Over the past decade, biclustering has become a popular data mining technique not only in the field of biological data analysis but also in other applications, such as text mining and market data analysis, with high-dimensional two-way datasets. Biclustering clusters both rows and columns of a dataset simultaneously, as opposed to traditional clustering which clusters either rows or columns of a datas...


Extraction of Web Usage Profiles using Simulated Annealing Based Biclustering Approach

In this paper, a Simulated Annealing (SA) based biclustering approach is proposed in which SA is used as an optimization tool for biclustering of web usage data to identify the optimal user profile from the given web usage data. Extracted biclusters consist of correlated users whose usage behaviors are similar across the subset of web pages of a web site, whereas these users are uncorrel...


Hybrid Swarm Intelligence-Based Biclustering Approach for Recommendation of Web Pages

This chapter focuses on recommender systems based on the coherent user’s browsing patterns. Biclustering approach is used to discover the aggregate usage profiles from the preprocessed Web data. A combination of Discrete Artificial Bees Colony Optimization and Simulated Annealing technique is used for optimizing the aggregate usage profiles from the preprocessed clickstream data. Web page recom...


Journal title:
  • IJAEC

Volume 2

Publication date 2011